92 research outputs found
Supervised Attentions for Neural Machine Translation
In this paper, we improve the attention or alignment accuracy of neural
machine translation by utilizing the alignments of training sentence pairs. We
simply compute the distance between the machine attentions and the "true"
alignments, and minimize this cost in the training procedure. Our experiments
on large-scale Chinese-to-English task show that our model improves both
translation and alignment qualities significantly over the large-vocabulary
neural machine translation system, and even beats a state-of-the-art
traditional syntax-based system.
Comment: 6 pages. In Proceedings of EMNLP 2016. arXiv admin note: text overlap with arXiv:1605.0314
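The supervision described above can be sketched as an auxiliary loss that penalizes the distance between the model's attention weights and the gold alignments. This is a minimal sketch under assumed interfaces (soft attention rows summing to 1, a 0/1 gold alignment matrix), not the paper's actual implementation:

```python
import numpy as np

def attention_supervision_loss(attn, gold_align, eps=1e-9):
    """Cross-entropy-style distance between model attention and gold alignments.

    attn:       (tgt_len, src_len) attention weights, each row sums to 1
    gold_align: (tgt_len, src_len) 0/1 gold alignment matrix
    """
    # Normalize gold rows so each target word defines a distribution;
    # target words with no gold alignment fall back to uniform.
    row_sums = gold_align.sum(axis=1, keepdims=True)
    gold = np.where(row_sums > 0,
                    gold_align / np.maximum(row_sums, 1),
                    1.0 / attn.shape[1])
    # Average per-target-word cross-entropy; minimized jointly with the
    # usual translation loss during training.
    return float(-(gold * np.log(attn + eps)).sum() / attn.shape[0])
```

In training, this term would simply be added (possibly weighted) to the standard negative log-likelihood objective.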
Research on Feature Extraction of Indicator Card Data for Sucker-Rod Pump Working Condition Diagnosis
Three feature extraction methods for sucker-rod pump indicator card data are studied, simulated, and compared in this paper, based on Fourier Descriptors (FD), Geometric Moment Vectors (GMV), and Gray Level Matrix Statistics (GLMX), respectively. Numerical experiments show that the Fourier Descriptors algorithm requires the least running time and memory space, with possible loss of information due to a nonoptimal number of Fourier Descriptors; the Geometric Moment Vector algorithm is more time-consuming and requires more memory space; and the Gray Level Matrix Statistics algorithm provides low-dimension feature vectors at the cost of more time and memory. Furthermore, the rotational invariance of both the Fourier Descriptors algorithm and the Geometric Moment Vector algorithm may result in improper pattern recognition of indicator card data when they are used for sucker-rod pump working condition diagnosis.
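As an illustration of the first method, Fourier Descriptors of a closed contour (such as a digitized indicator card curve) can be computed from the FFT of its complex-valued boundary points. This is a minimal sketch, not the paper's implementation; the truncation length `k` is an assumed parameter:

```python
import numpy as np

def fourier_descriptors(xs, ys, k=10):
    """First k Fourier Descriptor magnitudes of a closed contour.

    Dropping c_0 removes translation; taking magnitudes removes rotation
    and starting point; dividing by |c_1| removes scale. This invariance
    to rotation is exactly the property the abstract warns about.
    """
    z = np.asarray(xs, dtype=float) + 1j * np.asarray(ys, dtype=float)
    c = np.fft.fft(z)           # complex Fourier coefficients of the boundary
    mags = np.abs(c[1:k + 1])   # skip c_0, the translation term
    return mags / mags[0]       # normalize by |c_1|
```

For a perfect circle sampled uniformly, only the first descriptor is nonzero, which is a quick sanity check for an implementation like this.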
Fast-R2D2: A Pretrained Recursive Neural Network based on Pruned CKY for Grammar Induction and Text Representation
Recently, CKY-based models have shown great potential in unsupervised grammar
induction thanks to their human-like encoding paradigm, which runs recursively
and hierarchically, but requires cubic time complexity. Recursive
Transformer based on Differentiable Trees (R2D2) makes it possible to scale to
large language model pre-training even with complex tree encoder by introducing
a heuristic pruning method. However, the rule-based pruning approach suffers
from local optimum and slow inference issues. In this paper, we fix those
issues in a unified method. We propose to use a top-down parser as a
model-based pruning method, which also enables parallel encoding during
inference. Specifically, our parser casts parsing as a split-point scoring task,
which first scores all split points for a given sentence, and then recursively
splits a span into two by picking a split point with the highest score in the
current span. The reverse order of the splits is considered as the order of
pruning in the R2D2 encoder. Besides the bidirectional language model loss, we also
optimize the parser by minimizing the KL distance between the tree probabilities
from the parser and from R2D2. Our experiments show that Fast-R2D2 improves
performance significantly in grammar induction and achieves competitive results
in downstream classification tasks.
Comment: EMNLP 202
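The top-down split-point procedure described above can be sketched as a greedy recursion. This is a minimal sketch assuming precomputed per-position split scores; the scoring model itself (the trained parser) is omitted:

```python
def split_order(scores, lo, hi, out=None):
    """Recursively split span [lo, hi) at its highest-scoring split point.

    scores[i] is the score for splitting between positions i and i+1.
    Returns the split points in top-down order; reversing this list
    gives a bottom-up order analogous to the pruning order used by the
    encoder.
    """
    if out is None:
        out = []
    if hi - lo < 2:            # single-token span: nothing left to split
        return out
    # Pick the best split point strictly inside the current span.
    best = max(range(lo, hi - 1), key=lambda i: scores[i])
    out.append(best)
    split_order(scores, lo, best + 1, out)   # left child span
    split_order(scores, best + 1, hi, out)   # right child span
    return out
```

Because each span's children can be processed independently, this recursion also admits the parallel encoding the abstract mentions.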
Discover, Explanation, Improvement: Automatic Slice Detection Framework for Natural Language Processing
Current natural language processing (NLP) models such as BERT and RoBERTa
have achieved high overall performance, but they often make systematic errors
due to bias or features that are difficult to learn. Thus, research on slice
detection models (SDMs), which automatically identify underperforming groups of
datapoints, has gradually attracted more attention; such work aims at both
understanding model behaviors and providing insights for future model training
and design. However, there is little systematic research on SDMs and on
quantitative evaluation of their assessments of NLP models. Our paper fills this
gap by proposing the "Discover, Explanation, Improvement" framework, which
discovers coherent and underperforming groups of datapoints and unites the
datapoints of each slice under human-understandable concepts; it also provides
comprehensive evaluation tasks and corresponding quantitative metrics, enabling
convenient comparison for future work. Results show that our framework can
accurately select error-prone datapoints with informative semantic features
that summarize error patterns; building on these, it directly boosts the
performance of trained models by an average of 2.85 points across multiple
datasets without tuning any parameters.
Comment: 15 pages, 5 figure
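A deliberately simplified sketch of slice detection in this spirit: group a model's errors under a human-readable key and rank the resulting slices by size. Here the key is the (gold, predicted) label pair, a toy stand-in for the framework's concept-based slicing, not its actual method:

```python
from collections import Counter

def detect_error_slices(golds, preds):
    """Group misclassified datapoints into slices keyed by their
    (gold, predicted) label pair, returned largest-slice first."""
    slices = Counter(
        (g, p) for g, p in zip(golds, preds) if g != p
    )
    return slices.most_common()
```

The largest slices point at systematic error patterns; a real SDM would replace the label-pair key with learned, human-understandable concepts.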
The Trickle-down Impact of Reward (In-)consistency on RLHF
Standard practice within Reinforcement Learning from Human Feedback (RLHF)
involves optimizing against a Reward Model (RM), which itself is trained to
reflect human preferences for desirable generations. A notable yet
understudied subject is the (in-)consistency of RMs -- whether they can
recognize semantic changes across different prompts and adapt their reward
assignments appropriately -- and its impact on the downstream RLHF model.
In this paper, we examine a series of research questions relevant to RM
inconsistency: (1) How can we measure the consistency of reward models? (2) How
consistent are the existing RMs and how can we improve them? (3) In what ways
does reward inconsistency influence the chatbots resulting from the RLHF model
training?
We propose Contrast Instructions -- a benchmarking strategy for the
consistency of RMs. Each example in Contrast Instructions features a pair of
lexically similar instructions with different ground truth responses. A
consistent RM is expected to rank the corresponding instruction and response
higher than other combinations. We observe that current RMs trained with the
standard ranking objective fail miserably on Contrast Instructions compared to
average humans. To show that RM consistency can be improved efficiently without
using extra training budget, we propose two techniques ConvexDA and
RewardFusion, which enhance reward consistency through extrapolation during the
RM training and inference stage, respectively. We show that RLHF models trained
with a more consistent RM yield more useful responses, suggesting that reward
inconsistency exhibits a trickle-down effect on the downstream RLHF process
- …
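The Contrast Instructions evaluation described above can be sketched as a ranking check: for a pair of lexically similar instructions with different gold responses, a consistent RM should score each instruction with its own response above the two swapped pairings. A minimal sketch, assuming a caller-supplied scalar reward function `rm(instruction, response)` (an illustrative interface, not the paper's code):

```python
def contrast_consistent(rm, inst_a, resp_a, inst_b, resp_b):
    """True if the RM ranks each instruction's own gold response above
    both swapped (instruction, response) combinations."""
    return (
        rm(inst_a, resp_a) > rm(inst_a, resp_b)
        and rm(inst_b, resp_b) > rm(inst_b, resp_a)
    )

def contrast_accuracy(rm, examples):
    """Fraction of Contrast Instructions examples the RM ranks correctly.

    Each example is a tuple (inst_a, resp_a, inst_b, resp_b).
    """
    hits = sum(contrast_consistent(rm, *ex) for ex in examples)
    return hits / len(examples)
```

A benchmark built this way directly measures the sensitivity to semantic changes that the abstract argues standard ranking objectives fail to instill.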